Skip to content

Upgrade Elasticsearch from 7.x to 8.18.0 #6640

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

smithellis
Copy link
Contributor

@smithellis smithellis commented Apr 30, 2025

This commit upgrades our Elasticsearch infrastructure from version 7.x to 8.18.0,
including:

  • Import both elasticsearch7 and elasticsearch8
  • Poll the server to determine version, and use correct one
  • Removed elasticsearch-dsl dependency in favor of elasticsearch.dsl when on ES8
  • Modified ES connection settings and client configuration options
  • Updated ES module imports from elasticsearch_dsl to elasticsearch.dsl
  • Renamed test_es7_utils.py to test_es_utils.py with updated test cases
  • Added new ES client configuration options in settings.py
  • Fixed ES URL prefix to use http:// protocol explicitly
  • Made necessary code adjustments throughout search modules for compatibility

This upgrade provides better security, performance improvements, and access to newer
Elasticsearch features while maintaining backward compatibility with our existing
search functionality.

NOTE: We will need to plan and execute an upgrade path with hosted Elastic; this PR only solves code compatibility caused by breaking changes.

@smithellis smithellis mentioned this pull request Apr 30, 2025
)

# Handle special case where an index with same name as alias exists
try:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason for adding this check?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I can drop this; I added this when troubleshooting test issues with index and alias collisions. I'll see if dropping it causes problems to recur.

"request_timeout": settings.ES_TIMEOUT,
"retry_on_timeout": settings.ES_RETRY_ON_TIMEOUT,
# SSL settings - these are needed for ES8 which requires SSL by default
"verify_certs": settings.ES_VERIFY_CERTS,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's the reason for introducing all these new settings?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ES8 requires the SSL settings. retry_on_timeout was an attempt to improve reliability when troubleshooting test issues; the same with the sniff_on_start and sniff_on_connection_fail.
We can drop the sniff_* settings, but should consider leaving the retry_on_timeout. May not be critical but it could help when/if there are real connection issues.

"sniff_on_connection_fail": settings.ES_SNIFF_ON_CONNECTION_FAIL,
}

if settings.ES_HTTP_AUTH:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Authentication isn't always needed, and since it seems conditional it's pulled out on its own. That separates auth stuff from the more consistent/base connection settings, which we will always have.
I think it's a good pattern but am open to feedback.
That said, I don't think we have a case where we don't use auth.

@@ -134,10 +159,15 @@ def index_object(doc_type_name, obj_id):
# just return
return

kwargs = {}
# For ES8, use string "true" instead of boolean True for refresh parameter
if settings.TEST:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You already have this check above

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are three checks for settings.TEST in the file. The es_client has one to modify timeout and retry as part of local testing, index_object() and delete_object() both test to set the refresh to true. It didn't seem reasonable to pull that check into a helper function just for the second two tests. I don't think we want refresh set to true generally, as it forces an immediate refresh which would be expensive in prod.
I think we could remove the code I added for troubleshooting/resolving tests (so the es_client() timeout, retries and retry_on_timeout) and that would get rid of the test in es_client(). But we want the test in the other two places for sure.

kwargs = {}
# For ES8, use string "true" instead of boolean True for refresh parameter
if settings.TEST:
kwargs["refresh"] = "true"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the purpose of this argument?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It forces ES to refresh, making changes immediately visible to search. So it helps for local testing more than anything else.

@@ -7,6 +7,7 @@

class ForumDocumentSignalsTests(Elastic7TestCase):
def setUp(self):
super().setUp()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure that we need this

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since I changed the ElasticTestCase it is needed. Now we set the test environment up cleanly, force a refresh, and get in front of any index issues.

@@ -46,9 +48,16 @@ def make_first_doc_throw_exception(self, *args, **kwargs):
question = Question.objects.get(id=question_id)
AnswerFactory(question=question, content=f"answer {question_id}")

# Force ES index refresh before testing
QuestionDocument._index.refresh()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are you introducing this change? Did something change between 7 and 8?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just more ensuring that tests don't fail because of a delay between adding something to the index and it appearing in search results. So, tests won't fail due to timing issues.

@smithellis smithellis force-pushed the es818-update branch 3 times, most recently from 8fcee02 to bef0b8a Compare May 4, 2025 14:23
7 and 8.

We import both python libraries `elasticsearch7` and `elasticsearch8`
We test the server version and set it in `settings.py`
We use the correct library based on the server major version
Server vwrsion check is now in `search/utils.py` and uses
module level caching to prevent unneeded connections
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants